Dual-Tree Fast Gauss Transforms
Authors
Abstract
Kernel density estimation (KDE) is a popular statistical technique for estimating an underlying density with minimal assumptions. Although kernel density estimators can be shown to achieve asymptotic estimation optimality for any input distribution, cross-validating for an optimal bandwidth parameter requires significant computation, dominated by kernel summations. In this paper we present an improvement to the dual-tree algorithm, the first practical kernel summation algorithm for general dimension. Our extension is based on the series expansion for the Gaussian kernel used by the fast Gauss transform. First, we derive two additional pieces of analytical machinery for extending the original algorithm to use a hierarchical data structure, demonstrating the first truly hierarchical fast Gauss transform. Second, we show how to integrate the series-expansion approximation within the dual-tree approach to compute kernel summations with a user-controllable relative error bound. We evaluate our algorithm on real-world datasets in the context of optimal bandwidth selection in kernel density estimation. Our results demonstrate that our new algorithm is the only one that guarantees a hard relative error bound and offers fast performance across the wide range of bandwidths evaluated in cross-validation procedures.
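To make the cost being attacked concrete, the following is a minimal sketch of the naive Gaussian kernel summation and the KDE value built from it. It is not the dual-tree algorithm from the paper, only the brute-force baseline that any fast method must approximate. The function names (`gauss_sum`, `kde`, `within_relative_error`) and the kernel convention exp(-||q - r||^2 / (2 h^2)) are assumptions made for illustration.

```python
import math

def gauss_sum(queries, references, h):
    """Naive Gaussian kernel summation (assumed convention):
    G(q) = sum over r of exp(-||q - r||^2 / (2 h^2)).
    Cost is O(len(queries) * len(references)) per bandwidth h."""
    sums = []
    for q in queries:
        total = 0.0
        for r in references:
            sq_dist = sum((qi - ri) ** 2 for qi, ri in zip(q, r))
            total += math.exp(-sq_dist / (2.0 * h * h))
        sums.append(total)
    return sums

def kde(queries, references, h):
    """Gaussian kernel density estimate: the kernel sum divided by
    N * (2 * pi * h^2)^(D / 2)."""
    dim = len(references[0])
    norm = len(references) * (2.0 * math.pi * h * h) ** (dim / 2.0)
    return [g / norm for g in gauss_sum(queries, references, h)]

def within_relative_error(approx, exact, eps):
    """Check the relative error criterion |approx - exact| <= eps * exact
    for every query point (exact kernel sums are non-negative)."""
    return all(abs(a - e) <= eps * e for a, e in zip(approx, exact))
```

Bandwidth cross-validation repeats this summation for many values of `h`, which is why the quadratic cost dominates. A fast approximation would be checked against the exact sums with something like `within_relative_error(approx, exact, eps)`.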
Related resources
Insights on Fast Kernel Density Estimation Algorithms
We present results of experiments testing the Fast Gauss Transform, Improved Fast Gauss Transform, and Dual-Tree methods (using kd-tree and Anchors Hierarchy data structures) for fast Kernel Density Estimation (KDE). We examine the performance of these methods with respect to data set size, dimension, allowable error, and data set structure (“clumpiness”), measured in terms of CPU time and memo...
Empirical Testing of Fast Kernel Density Estimation Algorithms
We present results of experiments testing the Fast Gauss Transform, Improved Fast Gauss Transform, and Dual-Tree methods (using kd-tree and Anchors Hierarchy data structures) for fast Kernel Density Estimation (KDE). We examine the performance of these methods with respect to data set size, dimension, allowable error, and data set structure (“clumpiness”), measured in terms of CPU time and memo...
The fast Gauss transform with complex parameters
We construct a fast method, O(N log N), for the computation of discrete Gauss transforms with complex parameters, capable of dealing with unequally spaced grid points. The method is based on Fourier techniques, and in particular it makes use of a modified unequally spaced fast Fourier transform algorithm, in combination with previously suggested divide and conquer strategies for ordinary fast Ga...
Fast Approximation of the Discrete Gauss Transform in Higher Dimensions
We present a novel approach for the fast approximation of the discrete Gauss transform in higher dimensions. The algorithm is based on the dual-tree technique and introduces a new Taylor series expansion. It compares favorably to existing methods especially when it comes to higher dimensions and a broad range of bandwidths. Numerical results with different datasets in up to 62 dimensions demons...
Fast Gauss transforms with complex parameters using NFFTs
at the target knots y_j ∈ [−1/4, 1/4], j = 1, ..., M, where σ = a + ib, a > 0, b ∈ R denotes a complex parameter. Fast Gauss transforms for real parameters σ were developed, e.g., in [15, 8, 9]. In [12], we have specified a more general fast summation algorithm for the Gaussian kernel. Recently, a fast Gauss transform for complex parameters σ with arithmetic complexity O(N log N + M) was int...